Image analytics for crowdsourced photographs: no-code supervised and unsupervised classification solutions

Oleksandr Karasov (University of Helsinki, Digital Geography Lab, Helsinki Institute of Urban and Regional Studies), Evelyn Uuemaa (University of Tartu, Department of Geography)

Challenge

Geographers usually deal with geospatial raster or vector data. However, photographs uploaded by users to social media - such as Flickr or Twitter - sometimes also carry geographic coordinates in their metadata and can be treated as Volunteered Geographic Information. Geographic applications of such photographs include land use/land cover classification, disaster management, mapping urban activities and structure, analysing tourists' profiles, and more. In one way or another, ground-based photographs often contain helpful information that can contribute to spatial analysis and help build a narrative about a phenomenon. When thousands of pictures are available for a case study, automated image analytics becomes handy - but it may be tough to use without advanced data science skills. In practice, the learning curve of automated image analytics can be rather steep for geographers and landscape scientists.

How do Orange and Lobe software democratise image analytics?

Fortunately, thanks to democratised data science tools, quick image analytics has become possible. This workshop introduces two solutions for no-code image analytics: unsupervised clustering in Orange [1] (developed by the Bioinformatics Lab at the University of Ljubljana, Slovenia) and deep learning-based supervised classification implemented in Lobe (developed by Microsoft). Researchers with no data science background, students, and practitioners in landscape and urban planning can therefore benefit from big data analytics in their everyday work. Although neither Orange nor Lobe requires coding skills, because both provide a Graphical User Interface, the workflow for unsupervised clustering can be exported in an Orange-specific format for reproducibility. Similarly, Lobe's machine learning image classification model can be exported in the standard open-source TensorFlow format and applied to an independent dataset. Lobe is based on two convolutional neural networks: ResNet-50 V2 and MobileNet V2 - the latter is faster at the expense of accuracy, while ResNet-50 V2, enabled by default, provides better classification accuracy.

Case study

For this workshop, we will use a dataset of publicly available Flickr images from the territory of Tartu city (Figure 1), downloaded via the official Flickr Application Programming Interface (API) with the excellent photosearcher R package by Fox et al. [2]. Tartu is the second-largest city in Estonia, and its diverse landscape and people provide a rich data source for testing our approach. In our case study, we would like to find general categories of the urban motifs shared by different Flickr users without any a priori knowledge in this regard.

Figure 1. Tartu city as it appears in Google Maps
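As an aside, the download step can also be reproduced outside R. The sketch below shows how a request to the Flickr REST API's flickr.photos.search method could be constructed in Python; the bounding-box coordinates for Tartu are approximate illustrative values, and `YOUR_API_KEY` is a placeholder you would replace with your own key.

```python
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"

def build_flickr_search_url(api_key, bbox, min_date, max_date, page=1):
    """Build a flickr.photos.search request URL for geotagged photos.

    bbox is (min_lon, min_lat, max_lon, max_lat), as required by the API.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "bbox": ",".join(str(v) for v in bbox),
        "min_taken_date": min_date,
        "max_taken_date": max_date,
        "has_geo": 1,                      # only photos with coordinates
        "extras": "geo,date_taken,url_m",  # return coordinates and a medium-size URL
        "format": "json",
        "nojsoncallback": 1,
        "per_page": 250,
        "page": page,
    }
    return FLICKR_REST + "?" + urlencode(params)

# Approximate bounding box around Tartu (illustrative, not the exact study extent)
url = build_flickr_search_url("YOUR_API_KEY", (26.66, 58.34, 26.80, 58.41),
                              "2011-01-01", "2021-01-01")
```

Fetching the resulting URL (page by page) returns JSON with photo identifiers and coordinates, which is essentially what photosearcher automates for you.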

For our workshop, we pre-downloaded all the photographs taken in this location. This dataset (folder Flickr corpus) contains 9455 geolocated pictures taken in Tartu during the last ten years. Please download and save this dataset on your local PC. Important: although all the photographs are publicly available, this dataset contains personal data of Flickr users. The dataset via the link above is compiled for educational purposes only and will be removed after the workshop is done. Please also delete all the photographs from your PC immediately after the workshop!

Downloading and installing software

Orange

Let’s download Orange via the following link. You can choose between a standalone installer and a portable version; please note that the standalone version was used in the following examples. Install Orange, accepting the defaults. Run Orange and, under Options - Add-ons, choose the Image Analytics add-on. Click OK and wait until the Image Analytics add-on is installed (Figure 2). Restart Orange upon successful add-on installation.

Figure 2. Expanding Orange functionality with Image Analytics

Lobe

Lobe is available for download via the following link. Please install Lobe, too, accepting the defaults. Lobe is complemented with additional Image Tools software to apply the classifier you trained to another dataset (transfer learning). Image Tools can also collect data from the Flickr photo repository via the official API. Image Tools is available via the GitHub repository.

Unsupervised image clustering with Orange

Orange is built on top of Python and can be run from the command line. However, it also has a convenient GUI, organised as a workflow. Once the Image Analytics add-on is installed, proceed with importing images: select the folder with the 9455 Flickr images downloaded at the previous step, and add the ‘Image Embedding’ widget connected to ‘Import Images’ (Figure 3). The ‘Image Embedding’ widget includes several pre-trained machine learning models, including Google’s Inception v3, a 48-layered convolutional neural network trained on 1.2 million images from the ImageNet repository [3]. Orange sends the photographs to a Google server, where the Inception v3 model assigns 2048 numeric attributes to each photo. These attributes are the subject of further clustering. The pictures are not stored on the Google server; this process is secure.

Figure 3. Image embedding procedure

To cluster the photographs by their quantitative attributes, one needs to estimate the distance between data rows (images). Orange suggests using cosine distance (based on the angle between two numeric vectors) for image clustering, and we follow the same approach (Figure 4). We can explore the quantitative attributes using the ‘Data Table’ widget.

Figure 4. Estimating cosine distance between photographs
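To make the cosine distance concrete, here is a minimal sketch with NumPy; the toy 4-dimensional vectors stand in for the 2048-attribute embeddings produced in the previous step.

```python
import numpy as np

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity; 0 means the vectors point the same way."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy 4-dimensional "embeddings" standing in for the 2048-attribute vectors
img1 = [0.9, 0.1, 0.0, 0.2]
img2 = [0.8, 0.2, 0.1, 0.3]  # similar content, so the angle (and distance) is small
img3 = [0.0, 0.9, 0.8, 0.0]  # different content, so the angle (and distance) is large
```

Because the distance depends only on the angle between vectors, two photos with similar content but different overall brightness or contrast still end up close together, which is why it works well for embeddings.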

After that, we proceed with hierarchical clustering of the distances with default settings (Figure 5).

Figure 5. Adding hierarchical clustering
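Under the hood, hierarchical clustering repeatedly merges the closest groups of images into a tree (dendrogram), which is then cut at some level to obtain clusters. The SciPy-based sketch below illustrates the idea on two synthetic groups of embeddings; Orange's widget exposes the same ingredients (a linkage method and a cut level), though its exact defaults may differ from the average linkage used here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# Two synthetic groups of 4-dimensional "embeddings" pointing in different directions
base_a = np.array([1.0, 0.0, 0.0, 0.0])
base_b = np.array([0.0, 0.0, 1.0, 1.0])
group_a = base_a + rng.normal(scale=0.05, size=(10, 4))
group_b = base_b + rng.normal(scale=0.05, size=(10, 4))
embeddings = np.vstack([group_a, group_b])

# Pairwise cosine distances, then average-linkage hierarchical clustering
distances = pdist(embeddings, metric="cosine")
tree = linkage(distances, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")  # cut the dendrogram into 2 clusters
```

With clearly separated directions, cutting the tree into two clusters recovers the two synthetic groups exactly.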

After adding the hierarchical clustering, we can save all the quantitative attributes and information about the clusters using the ‘Save Data’ widget (Figure 6).

Figure 6. Exporting .xlsx spreadsheet with attributes, image and cluster information

To export our images grouped into subfolders according to cluster name, we should use the ‘Select Columns’ widget and ensure ‘Cluster’ is in the ‘Target’ field (Figure 7).

Figure 7. Choosing clusters as a target for export

Please choose the directory in which to save the classified images: write the name of the intended new folder and click ‘Save’ (Figure 8).

Figure 8. Exporting classified photographs
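For readers who prefer to script this step, the export amounts to copying each photo into a subfolder named after its cluster. A minimal sketch, assuming you have a filename-to-cluster mapping (e.g., read from the spreadsheet saved earlier):

```python
import shutil
import tempfile
from pathlib import Path

def export_by_cluster(assignments, src_dir, dst_dir):
    """Copy each image into a subfolder named after its cluster (e.g. C1, C2, ...)."""
    src, dst = Path(src_dir), Path(dst_dir)
    for filename, cluster in assignments.items():
        target = dst / cluster
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src / filename, target / filename)

# Tiny demo with placeholder files standing in for photographs
src = Path(tempfile.mkdtemp())
dst = Path(tempfile.mkdtemp())
for name in ("a.jpg", "b.jpg", "c.jpg"):
    (src / name).write_bytes(b"fake image data")
export_by_cluster({"a.jpg": "C1", "b.jpg": "C1", "c.jpg": "C2"}, src, dst)
```

Copying (rather than moving) keeps the original corpus intact, so you can regroup the images later if you re-run the clustering.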

Now, let’s explore our photographs. There are 46 clusters (C1-C46) with a very uneven distribution of photos among them. C1 predominantly contains photos of billiard players, C5 - events, C10 - flowers and plants, C12 - nighttime river landscapes, C13 - autumn parks, C16 - winter landscapes, C21 - close-up wall photos, C22 - people and events, C26 - people and children, C27-C28 - architecture, C30 - events and trucks, C34 - events, etc. We can therefore see some logic in the differences among the clusters, although many content discrepancies remain. From here, we can proceed with training our custom image classification model.

Training image classification model with Lobe

We can benefit from the unsupervised clustering in Orange because we can create our training dataset for Lobe based on it. Which aspect of the photos or their content to choose for classification is entirely up to our research task. Lobe accepts a folder with nested subfolders as input for training, so we can move selected photographs from the clusters into a new folder. For convenience, you can find already pre-selected categories of photos via the same link (folder Lobe - train). There are 11 subfolders: architecture, art, bridges, events, flowers, food, green_spaces, nightscapes, people, transport, and winterscapes. Let’s download this folder and import it into Lobe to train our model (Figure 9).

Optimally, each category in the training dataset should have an equal number of instances. At least five photos per category are needed to start the training, but in my experience the optimal number of images is about 200 or more (e.g., 500). For now, we do not follow that rule strictly, and some classes have fewer photos than others. This makes our training dataset unbalanced (the predictive power of the resulting model will not be equally distributed among classes) but still acceptable for demonstration purposes. For more information on data quantity and balance, see the following link.
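If you assemble your own training folder, a quick balance check is easy to script. The sketch below counts images per class subfolder and applies a rough heuristic; the thresholds (200 images per class, a 2:1 largest-to-smallest ratio) are illustrative values based on the guidance above, not rules enforced by Lobe.

```python
from collections import Counter
from pathlib import Path

def class_counts(train_dir):
    """Count images per class subfolder (Lobe treats subfolder names as labels)."""
    return Counter({sub.name: sum(1 for f in sub.iterdir() if f.is_file())
                    for sub in Path(train_dir).iterdir() if sub.is_dir()})

def is_balanced(counts, min_per_class=200, max_ratio=2.0):
    """Rough heuristic: every class has enough images, and the largest class
    is at most max_ratio times the size of the smallest."""
    smallest, largest = min(counts.values()), max(counts.values())
    return smallest >= min_per_class and largest / smallest <= max_ratio

# Illustrative counts for an unbalanced dataset like the workshop one
counts = Counter({"architecture": 500, "people": 450, "flowers": 30})
```

Running `is_balanced(class_counts("Lobe - train"))` on your own folder flags imbalance before you spend time training.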

Figure 9. Importing training dataset

When the model is trained (Figure 10), the overall accuracy reaches 90% in my case (in your case, it might be slightly different). If there are incorrect predictions, you can interactively correct them and improve the model. Lobe takes a random 80% / 20% split of the dataset for training vs evaluation (cross-validation), so the specific images used for training will differ when you import the same folder into different projects.

Figure 10. Lobe image classification model trained
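The random split is why two projects trained on the same folder can report slightly different accuracies. Lobe does not expose this step, but the idea can be sketched in a few lines:

```python
import random

def train_validation_split(filenames, train_fraction=0.8, seed=None):
    """Randomly partition files into training and evaluation sets (default 80/20),
    mirroring the split Lobe performs internally."""
    files = list(filenames)
    rng = random.Random(seed)
    rng.shuffle(files)
    cut = int(len(files) * train_fraction)
    return files[:cut], files[cut:]

photos = [f"img_{i:03d}.jpg" for i in range(100)]
train, val = train_validation_split(photos, seed=42)
```

With a different seed (or none), the shuffle, and therefore the evaluation set and the reported accuracy, changes slightly, just as it does between Lobe projects.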

However, the model can still be optimised to improve accuracy. Lobe supports fine-tuning the hyperparameters of the trained model, but this may take 1-2 hours, depending on your computer hardware. We will therefore skip the fine-tuning at the workshop, but you should always consider optimisation as the final step before exporting the trained model (Figure 11).

Figure 11. Optimising image classification model

Optimisation of the model took about 1 hour in my case, but the accuracy of predictions increased to 96% (Figure 12). It isn’t much, but it’s honest work!

Figure 12. Optimised model (took approx. 1 hour)

We can export the model to one of several commonly used formats. I recommend the TensorFlow format by default (Figure 13), which can easily be imported into the Image Tools provided by the Lobe developers to apply the model to new datasets. You can find Image Tools and supporting instructions via the following GitHub repository. You must be logged in to GitHub to download Image Tools.

Figure 13. Exporting optimised model in TensorFlow format

Finally, you can run Image Tools right after unpacking the archive. Under the ‘Model’ tab (Figure 14), choose the folder with unclassified images - for this purpose, download the folder ‘Lobe - unclassified’ here. Also, select the directory of the TensorFlow model exported at the previous step.

Figure 14. Applying TensorFlow model to the folder with unclassified images
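Whatever tool applies the model, each image ultimately receives a probability per class, and some post-processing decides the final label. The sketch below is a generic example of that last step, not the actual Image Tools output format; the 0.5 confidence threshold and the "uncertain" fallback are illustrative choices.

```python
def top_prediction(probabilities, threshold=0.5):
    """Pick the most probable class; fall back to 'uncertain' below the threshold."""
    label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    return (label, confidence) if confidence >= threshold else ("uncertain", confidence)

# Hypothetical per-class probabilities for one photograph
pred = {"architecture": 0.71, "bridges": 0.18, "people": 0.11}
```

Flagging low-confidence images like this is a simple way to spot photographs worth adding (with correct labels) to the next round of training.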

Congratulations! Your model is now applied to unseen data. Please explore the quality of the predictions - are you satisfied? There are still some discrepancies, but the result is much better tailored to our understanding than the unsupervised clustering. By adjusting the prediction categories and ensuring sufficient quantity and diversity in the training dataset, the quality of the model may improve significantly. You can apply the image classification functionality of Orange/Lobe to any images relevant to your study - regarding land use/land cover, presence of disasters, urban structure, etc. Good luck with your further studies and research!

References

  1. Demsar J, Curk T, Erjavec A, Gorup C, Hocevar T, Milutinovic M, Mozina M, Polajnar M, Toplak M, Staric A, Stajdohar M, Umek L, Zagar L, Zbontar J, Zitnik M, Zupan B. Orange: Data Mining Toolbox in Python. Journal of Machine Learning Research. 2013;14(Aug):2349-2353.
  2. Fox N, August T, Mancini F, Parks KE, Eigenbrod F, Bullock JM, Sutter L, Graham LJ. “photosearcher” package in R: An accessible and reproducible method for harvesting large datasets from Flickr. SoftwareX. 2020 Jul 1;12:100624.
  3. Godec P, Pančur M, Ilenič N, Čopar A, Stražar M, Erjavec A, Pretnar A, Demšar J, Starič A, Toplak M, Žagar L. Democratized image analytics by visual programming through integration of deep models and small-scale machine learning. Nature Communications. 2019 Oct 7;10(1):1-7.